Data provenance – the foundation of data quality

Authors

  • Peter Buneman
  • Susan B. Davidson
Abstract

Provenance for data is often defined by analogy with its use for the history of non-digital artifacts, typically works of art. While this provides a starting point for our understanding of the term, it is not adequate for at least two important reasons. First, an art historian seeing a copy of some artifact will regard its provenance as very different from that of the original. By contrast, when we make use of a digital artifact, we often use a copy, and copying in no sense destroys the provenance of that artifact. Second, digital artifacts are seldom “raw” data. They are created by a process of transformation or computation on other data sets, and we regard this process as a part of the provenance. For “non-digital” artifacts, provenance is usually traced back to the point of creation, but no further.

Similar resources

Tracking Editing Processes in Volunteered Geographic Information: The Case of OpenStreetMap

With an increasing number of applications building on OpenStreetMap, data quality is becoming a pressing issue. Data provenance gives useful hints that facilitate data quality assessments based on the features’ persistence. However, this requires a detailed analysis of the editing history and the corresponding contributors. In order to make this provenance information explicit, we introduce a p...

Dependency Path Patterns as the Foundation of Access Control in Provenance-aware Systems

A unique characteristic of provenance data is that it forms a directed acyclic graph (DAG) in accordance with the underlying causality dependencies between entities (acting users, action processes and data objects) involved in transactions. Data provenance raises at least two distinct security-related issues. One is how to control access to provenance data, which we call Provenance Access contr...

QualityTrails: Data Quality Provenance as a Basis for Sensemaking

Visual Analytics prototypes increasingly support human sensemaking through providing Provenance information. For data analysts the challenge of knowledge generation starts with assessing the quality of a data set, but Provenance is not yet utilized to aid this task. This position paper aims at characterizing the complexity of Visual Analytics methods introducing Provenance in Data Quality by hi...

Provenance Information in the Web of Data

The openness of the Web and the ease of combining linked data from different sources create new challenges. Systems that consume linked data must evaluate the quality and trustworthiness of the data. A common approach for data quality assessment is the analysis of provenance information. For this reason, this paper discusses provenance of data on the Web and proposes a suitable provenance model. Whi...

Provenance Traces

Provenance is information about the origin, derivation, ownership, or history of an object. It has recently been studied extensively in scientific databases and other settings due to its importance in helping scientists judge data validity, quality and integrity. However, most models of provenance have been stated as ad hoc definitions motivated by informal concepts such as “comes from”, “influ...

Publication date: 2010